In [ ]:
# Copyright 2020 Google LLC
#
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
#
#     https://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.

Overview

AI Platform Online Prediction now supports custom Python code through custom prediction routines, including custom (stateful) pre/post-processing and models not created by the standard supported frameworks (TensorFlow, Keras, scikit-learn, XGBoost).

Dataset

We use the Iris dataset.

Objective

In this notebook, we show how to deploy a model created with PyTorch to AI Platform using Custom Prediction Code, with the Iris dataset as a multi-class classification problem.

Costs

This tutorial uses billable components of Google Cloud Platform (GCP):

  • Cloud AI Platform
  • Cloud Storage

Learn about Cloud AI Platform pricing and Cloud Storage pricing, and use the Pricing Calculator to generate a cost estimate based on your projected usage.

Set up your local development environment

If you are using Colab or AI Platform Notebooks, your environment already meets all the requirements to run this notebook. You can skip this step.

Otherwise, make sure your environment meets this notebook's requirements. You need the following:

  • The Google Cloud SDK
  • Git
  • Python 3
  • virtualenv
  • Jupyter notebook running in a virtual environment with Python 3

The Google Cloud guide to Setting up a Python development environment and the Jupyter installation guide provide detailed instructions for meeting these requirements. The following steps provide a condensed set of instructions:

  1. Install and initialize the Cloud SDK.

  2. Install Python 3.

  3. Install virtualenv and create a virtual environment that uses Python 3.

  4. Activate that environment and run pip install jupyter in a shell to install Jupyter.

  5. Run jupyter notebook in a shell to launch Jupyter.

  6. Open this notebook in the Jupyter Notebook Dashboard.

Set up your GCP project

The following steps are required, regardless of your notebook environment.

  1. Select or create a GCP project. When you first create an account, you get a $300 free credit towards your compute/storage costs.

  2. Make sure that billing is enabled for your project.

  3. Enable the AI Platform APIs and Compute Engine APIs.

  4. Enter your project ID in the cell below. Then run the cell to make sure the Cloud SDK uses the right project for all the commands in this notebook.

Note: Jupyter runs lines prefixed with ! as shell commands, and it interpolates Python variables prefixed with $ into these commands.
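For example, the following cell (using a hypothetical DATASET_NAME variable) shows how Jupyter expands a Python value into a shell command:


In [ ]:
# hypothetical example: Jupyter substitutes $DATASET_NAME before running the shell command
DATASET_NAME = 'iris'
!echo "Using dataset: $DATASET_NAME"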

Authenticate your GCP account

If you are using AI Platform Notebooks, your environment is already authenticated. Skip this step.

If you are using Colab, run the cell below and follow the instructions when prompted to authenticate your account via oAuth.

Otherwise, follow these steps:

  1. In the GCP Console, go to the Create service account key page.

  2. From the Service account drop-down list, select New service account.

  3. In the Service account name field, enter a name.

  4. From the Role drop-down list, select Machine Learning Engine > AI Platform Admin and Storage > Storage Object Admin.

  5. Click Create. A JSON file that contains your key downloads to your local environment.

  6. Enter the path to your service account key as the GOOGLE_APPLICATION_CREDENTIALS variable in the cell below and run the cell.


In [ ]:
import sys

# If you are running this notebook in Colab, run this cell and follow the
# instructions to authenticate your GCP account. This provides access to your
# Cloud Storage bucket and lets you submit training jobs and prediction
# requests.

if 'google.colab' in sys.modules:
  from google.colab import auth as google_auth
  google_auth.authenticate_user()

# If you are running this notebook locally, replace the string below with the
# path to your service account key and run this cell to authenticate your GCP
# account.
else:
  %env GOOGLE_APPLICATION_CREDENTIALS ''

Install packages and dependencies with pip

Before we start, let's install PyTorch.


In [ ]:
!pip install torch --user

Set your GCP project ID and Cloud Storage bucket name in the cell below.


In [ ]:
PROJECT = '' # TODO (Set to your GCP project ID)
BUCKET = '' # TODO (Set to your GCS bucket name)

In [ ]:
!gcloud config set project {PROJECT}
!gcloud config get-value project

Download the Iris data

In this example, we build a classifier for the simple Iris dataset, so first we download the data CSV file locally.


In [ ]:
# -p avoids errors if the directories already exist on a re-run
!mkdir -p data
!mkdir -p models

In [ ]:
LOCAL_DATA_DIR = "data/iris.csv"

In [ ]:
from urllib.request import urlretrieve

urlretrieve("https://archive.ics.uci.edu/ml/machine-learning-databases/iris/iris.data", LOCAL_DATA_DIR)
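As a quick sanity check, you can peek at the first few rows of the downloaded file (note that it has no header row, so we pass the column names explicitly):


In [ ]:
import pandas as pd

# the CSV has no header row, so provide the column names used throughout this notebook
pd.read_csv(LOCAL_DATA_DIR, names=['sepal_length', 'sepal_width', 'petal_length', 'petal_width', 'species']).head()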

Part 1: Build a PyTorch NN Classifier

Make sure that the PyTorch package is installed.


In [ ]:
import torch

print('PyTorch Version: {}'.format(torch.__version__))

1. Load Data

In this step, we are going to:

  1. Load the data into a Pandas DataFrame.
  2. Convert the class feature (species) from string to a numeric indicator.
  3. Split the DataFrame into the input features (xtrain) and the target feature (ytrain).

In [ ]:
import pandas as pd

CLASS_VOCAB = ['setosa', 'versicolor', 'virginica']

datatrain = pd.read_csv(LOCAL_DATA_DIR, names=['sepal_length', 'sepal_width', 'petal_length', 'petal_width', 'species'])

# map the string labels to numeric values
datatrain.loc[datatrain['species']=='Iris-setosa', 'species'] = 0
datatrain.loc[datatrain['species']=='Iris-versicolor', 'species'] = 1
datatrain.loc[datatrain['species']=='Iris-virginica', 'species'] = 2
datatrain = datatrain.apply(pd.to_numeric)

# convert the DataFrame to a NumPy array
# (DataFrame.as_matrix() was removed in pandas 1.0, so use .values)
datatrain_array = datatrain.values

# split into input features (x) and target (y)
xtrain = datatrain_array[:, :4]
ytrain = datatrain_array[:, 4]

input_features = xtrain.shape[1]
num_classes = len(CLASS_VOCAB)

print('Records loaded: {}'.format(len(xtrain)))
print('Number of input features: {}'.format(input_features))
print('Number of classes: {}'.format(num_classes))

2. Set model parameters

You can try different values for HIDDEN_UNITS or LEARNING_RATE.


In [ ]:
HIDDEN_UNITS = 10
LEARNING_RATE = 0.1

3. Define the PyTorch NN model

Here, we build a neural network with one hidden layer, and a Softmax output layer for classification.


In [ ]:
model = torch.nn.Sequential(
    torch.nn.Linear(input_features, HIDDEN_UNITS),
    torch.nn.Sigmoid(),
    torch.nn.Linear(HIDDEN_UNITS, num_classes),
    torch.nn.Softmax(dim=1)  # normalize over the class dimension
)

loss_metric = torch.nn.CrossEntropyLoss()
optimizer = torch.optim.SGD(model.parameters(), lr=LEARNING_RATE)
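Note that torch.nn.CrossEntropyLoss applies log-softmax internally, so it usually expects raw logits. The model above (with an explicit Softmax layer) still trains on this small dataset, but a cleaner variant drops the Softmax from the network and applies it only at inference time. A minimal sketch of that alternative (not used in the rest of this notebook):


In [ ]:
# Alternative sketch (not used below): train on raw logits,
# since CrossEntropyLoss applies log-softmax internally.
logit_model = torch.nn.Sequential(
    torch.nn.Linear(input_features, HIDDEN_UNITS),
    torch.nn.Sigmoid(),
    torch.nn.Linear(HIDDEN_UNITS, num_classes)  # raw logits, no Softmax layer
)

# at inference time, recover class probabilities explicitly:
# probs = torch.nn.functional.softmax(logit_model(inputs), dim=1)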

4. Train the model

We are going to train the model for NUM_EPOCHS epochs.


In [ ]:
NUM_EPOCHS = 10000

# the full dataset fits in memory, so build the tensors once outside the loop
x = torch.Tensor(xtrain).float()
y = torch.Tensor(ytrain).long()

for epoch in range(NUM_EPOCHS):
    optimizer.zero_grad()
    y_pred = model(x)
    loss = loss_metric(y_pred, y)
    loss.backward()
    optimizer.step()
    if epoch % 1000 == 0:
        print('Epoch [{}/{}] Loss: {}'.format(epoch + 1, NUM_EPOCHS, round(loss.item(), 3)))

print('Epoch [{}/{}] Loss: {}'.format(epoch + 1, NUM_EPOCHS, round(loss.item(), 3)))

5. Save and load the model


In [ ]:
LOCAL_MODEL_DIR = "models/model.pt"

# torch.save with a model object pickles the entire model;
# torch.load restores it, provided the model's classes are importable
torch.save(model, LOCAL_MODEL_DIR)
iris_classifier = torch.load(LOCAL_MODEL_DIR)
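Saving the whole model object relies on pickle. A common alternative, shown here only as a sketch (the rest of this notebook keeps the whole-model approach), is to save just the state_dict and rebuild the architecture before loading:


In [ ]:
# Alternative sketch: save only the weights; assumes the same
# architecture is rebuilt before loading the state_dict.
torch.save(model.state_dict(), 'models/model_weights.pt')

restored = torch.nn.Sequential(
    torch.nn.Linear(input_features, HIDDEN_UNITS),
    torch.nn.Sigmoid(),
    torch.nn.Linear(HIDDEN_UNITS, num_classes),
    torch.nn.Softmax(dim=1)
)
restored.load_state_dict(torch.load('models/model_weights.pt'))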

6. Test the loaded model for predictions


In [ ]:
def predict_class(instances):
    # no gradients are needed for inference
    with torch.no_grad():
        instances = torch.Tensor(instances)
        output = iris_classifier(instances)
        # the index of the highest score is the predicted class
        _, predicted = torch.max(output, 1)
    return predicted

Get predictions for the first 5 instances in the dataset


In [ ]:
predicted = predict_class(xtrain[0:5])
print([CLASS_VOCAB[class_index] for class_index in predicted])

Get the classification accuracy on the training data


In [ ]:
import numpy as np

accuracy = round(sum(np.array(predict_class(xtrain)) == ytrain)/float(len(ytrain))*100,2)
print('Classification accuracy: {} %'.format(accuracy))

7. Upload the trained model to Cloud Storage


In [ ]:
GCS_MODEL_DIR='models/pytorch/iris_classifier/'

!gsutil -m cp -r {LOCAL_MODEL_DIR} gs://{BUCKET}/{GCS_MODEL_DIR}
!gsutil ls gs://{BUCKET}/{GCS_MODEL_DIR}

Part 2: Prepare the Custom Prediction Package

  1. Implement a custom model class for pre/post-processing, as well as for loading and using your model for prediction.
  2. Prepare your setup.py file to include all the modules and packages you need in your custom model class.

1. Create the custom model class

In from_path, you load the PyTorch model that you uploaded to GCS. Then, in the predict method, you use it for prediction.


In [ ]:
%%writefile model.py

import os
import pandas as pd
import torch


class PyTorchIrisClassifier(object):

    def __init__(self, model):
        self._model = model
        self.class_vocab = ['setosa', 'versicolor', 'virginica']

    @classmethod
    def from_path(cls, model_dir):
        # AI Platform passes the directory given as --origin when creating the version
        model_file = os.path.join(model_dir, 'model.pt')
        model = torch.load(model_file)
        return cls(model)

    def predict(self, instances, **kwargs):
        # DataFrame.values instead of the removed DataFrame.as_matrix()
        data = pd.DataFrame(instances).values
        inputs = torch.Tensor(data)
        outputs = self._model(inputs)
        _, predicted = torch.max(outputs, 1)
        return [self.class_vocab[class_index] for class_index in predicted]
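Before packaging the class, it is worth a quick local smoke test. A minimal sketch, assuming the model.pt saved earlier is still under the local models/ directory:


In [ ]:
# local smoke test of the custom class, loading model.pt from the local models/ directory
from model import PyTorchIrisClassifier

classifier = PyTorchIrisClassifier.from_path('models')
print(classifier.predict([[6.8, 2.8, 4.8, 1.4]]))  # expect one of the three species names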

2. Create a setup.py module

Do not include PyTorch in REQUIRED_PACKAGES: a compatible PyTorch build is supplied as a separate package URI when we create the model version below. Do include the model.py file that contains your custom model class.


In [ ]:
%%writefile setup.py

from setuptools import setup

REQUIRED_PACKAGES = []

setup(
    name="iris-custom-model",
    version="0.1",
    scripts=["model.py"],
    install_requires=REQUIRED_PACKAGES
)

3. Create the package

This creates a .tar.gz package under the dist/ directory. The name of the package will be {name}-{version}.tar.gz, where name and version are the ones specified in setup.py.


In [ ]:
!python setup.py sdist
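You can verify that the expected archive was produced:


In [ ]:
!ls -l dist/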

4. Upload the package to GCS


In [ ]:
GCS_PACKAGE_URI='models/pytorch/packages/iris-custom-model-0.1.tar.gz'

!gsutil cp ./dist/iris-custom-model-0.1.tar.gz gs://{BUCKET}/{GCS_PACKAGE_URI}
!gsutil ls gs://{BUCKET}/{GCS_PACKAGE_URI}

Part 3: Deploy the Model to AI Platform for Online Predictions

1. Create AI Platform model


In [ ]:
MODEL_NAME='torch_iris_classifier'
REGION = 'us-central1'

In [ ]:
# You can uncomment to enable logging
!gcloud ai-platform models create {MODEL_NAME} --regions {REGION} #--enable-logging --enable-console-logging
!gcloud ai-platform models list | grep 'torch'

2. Create AI Platform model version

Once you have your custom package ready, you can specify it as an argument when creating a version resource. Note that you need to provide the path to your package (as --package-uris) and also the name of the class that contains your custom predict method (as --prediction-class).

PyTorch-compatible packages

You need to use compiled PyTorch packages that are compatible with Cloud AI Platform prediction.

The gs://cloud-ai-pytorch bucket contains such compiled packages, mirrored from the official builds at https://download.pytorch.org/whl/cpu/torch_stable.html.

In order to deploy a PyTorch model on Cloud AI Platform Online Predictions, you must add one of these packages to the package-uris field of the version you deploy. Pick the package matching your Python and PyTorch versions. The package names follow this template:

torch-{TORCH_VERSION}-{PYTHON_VERSION}-linux_x86_64.whl

where PYTHON_VERSION is cp35-cp35m for Python 3 with runtime versions < 1.15, cp37-cp37m for Python 3 with runtime versions >= 1.15, and cp27-cp27mu for Python 2.

For example, to deploy a model based on PyTorch 1.1.0 and Python 3 (runtime version < 1.15), the gcloud command would look like:

gcloud beta ai-platform versions create {VERSION_NAME} --model {MODEL_NAME} \ ... --package-uris=gs://{MY_PACKAGE_BUCKET}/my_package-0.1.tar.gz,gs://cloud-ai-pytorch/torch-1.1.0-cp35-cp35m-linux_x86_64.whl


In [ ]:
MODEL_VERSION='v3'
RUNTIME_VERSION='1.15'
MODEL_CLASS='model.PyTorchIrisClassifier'

!gcloud beta ai-platform versions create {MODEL_VERSION} --model={MODEL_NAME} \
            --origin=gs://{BUCKET}/{GCS_MODEL_DIR} \
            --python-version=3.7 \
            --runtime-version={RUNTIME_VERSION} \
            --machine-type=mls1-c4-m4 \
            --package-uris=gs://{BUCKET}/{GCS_PACKAGE_URI},gs://cloud-ai-pytorch/torch-1.3.1+cpu-cp37-cp37m-linux_x86_64.whl \
            --prediction-class={MODEL_CLASS}

In [ ]:
!gcloud ai-platform versions list --model {MODEL_NAME}
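Once the new version reaches the READY state, you can inspect its configuration:


In [ ]:
!gcloud ai-platform versions describe {MODEL_VERSION} --model {MODEL_NAME}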

Part 4: AI Platform Online Prediction


In [ ]:
from googleapiclient import discovery
from oauth2client.client import GoogleCredentials

credentials = GoogleCredentials.get_application_default()
api = discovery.build('ml', 'v1', credentials=credentials,
                      discoveryServiceUrl='https://storage.googleapis.com/cloud-ml/discovery/ml_v1_discovery.json')


def estimate(project, model_name, version, instances):
    request_data = {'instances': instances}
    model_url = 'projects/{}/models/{}/versions/{}'.format(project, model_name, version)
    response = api.projects().predict(body=request_data, name=model_url).execute()

    # surface any service-side error instead of failing on a missing key
    if 'error' in response:
        raise RuntimeError(response['error'])

    return response['predictions']

In [ ]:
instances = [
    [6.8, 2.8, 4.8, 1.4],
    [6. , 3.4, 4.5, 1.6]
]

predictions = estimate(instances=instances,
                       project=PROJECT,
                       model_name=MODEL_NAME,
                       version=MODEL_VERSION)

print(predictions)
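You can also test the deployed version from the command line with gcloud, which reads one JSON instance per line from a local file:


In [ ]:
# write one JSON instance per line, as expected by --json-instances
with open('instances.json', 'w') as f:
    f.write('[6.8, 2.8, 4.8, 1.4]\n')
    f.write('[6.0, 3.4, 4.5, 1.6]\n')

!gcloud ai-platform predict --model {MODEL_NAME} --version {MODEL_VERSION} --json-instances instances.json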

Questions? Feedback?

Feel free to send us an email (cloudml-feedback@google.com) if you run into any issues or have any questions/feedback!